The performance of the WHO COVID-19 severity classification, COVID-GRAM, VACO Index, 4C Mortality, and CURB-65 prognostic scores in hospitalized COVID-19 patients: data on 4014 patients from a tertiary center registry

Aim To evaluate the predictive properties of several common prognostic scores regarding survival outcomes in hospitalized COVID-19 patients.

Methods We retrospectively reviewed the medical records of 4014 consecutive COVID-19 patients hospitalized in our tertiary level institution from March 2020 to March 2021. Prognostic properties of the WHO COVID-19 severity classification, COVID-GRAM, Veterans Health Administration COVID-19 (VACO) Index, 4C Mortality Score, and CURB-65 score regarding 30-day mortality, in-hospital mortality, presence of severe or critical disease on admission, need for an intensive care unit treatment, and mechanical ventilation during hospitalization were evaluated.

Results All of the investigated prognostic scores significantly distinguished between groups of patients with different 30-day mortality. The CURB-65 and 4C Mortality Score had the best prognostic properties for prediction of 30-day mortality (area under the curve [AUC] 0.761 for both) and in-hospital mortality (AUC 0.757 and 0.762, respectively). The 4C Mortality Score and COVID-GRAM best predicted the presence of severe or critical disease (AUC 0.785 and 0.717, respectively). In the multivariate analysis evaluating 30-day mortality, all scores mutually independently provided additional prognostic information, except the VACO Index, whose prognostic properties were redundant.

Conclusion Complex prognostic scores based on many parameters and comorbid conditions did not have better prognostic properties regarding survival outcomes than a simple CURB-65 prognostic score. CURB-65 also provides the largest number of prognostic categories (five), allowing more precise risk stratification than other prognostic scores.

Before the start of vaccination, COVID-19 was characterized by a high proportion of patients developing respiratory insufficiency and requiring hospital admission. Vaccination successfully reduced the number of severe/critical patients and resulted in an improved clinical course of the disease (1). Nevertheless, vaccine hesitancy and waning immunity remain obstacles to a successful vaccination program (2,3). In patients developing severe/critical COVID-19, progressive respiratory deterioration is usually accompanied by dysfunction of other organ systems, such as the circulatory, hepatobiliary, and central nervous systems (4-7). The presence of comorbidities (8) and the severity of the inflammatory process (9) are important predictors of unfavorable outcomes. A number of prognostic systems developed before and during the pandemic have been investigated in COVID-19 patients (10). These scores estimate respiratory, inflammatory, and comorbidity status, and perform differently in various cohorts of patients. Due to uncertainties regarding the prognostication of hospitalized patients with severe or critical presentation of COVID-19, we aimed to evaluate several common prognostic scores in patients from our tertiary level institution.

PATIENTS AND METHODS
We retrospectively evaluated the electronic and paper medical records of 4014 consecutive COVID-19 patients admitted to Dubrava University Hospital, a tertiary-level institution, from March 2020 to March 2021. Baseline clinical and laboratory data as well as clinical outcomes of patients were recorded as a part of a Hospital Registry Project. All patients were of white race. All patients had a positive polymerase chain reaction or antigen COVID-19 test before hospital admission. During hospitalization, patients were treated according to contemporary guidelines with various exposures to low molecular weight heparin (LMWH), corticosteroids, and remdesivir. The study was approved by the Institutional Review Board of Dubrava University Hospital (2021/2503-04).
Disease severity on admission was determined according to the World Health Organization (WHO) as mild, moderate, severe, or critical (11). In addition to WHO COVID-19 severity classification, the following prognostic scores were evaluated:
1) The COVID-GRAM score (12) was developed to evaluate the risk of critical illness among hospitalized patients with presumed COVID-19. It is based on the presence of x-ray abnormalities, age, hemoptysis, dyspnea, unconsciousness, number of comorbidities, cancer history, neutrophil-to-lymphocyte ratio, lactate dehydrogenase, and direct bilirubin. Patients are stratified into three risk categories.
2) The Veterans Health Administration COVID-19 (VACO) Index (13) was originally developed to evaluate 30-day mortality in potential COVID-19 patients. It is based on demographic parameters (age, sex) and comorbidities. The score incorporates no actual disease severity. Patients are stratified into four risk categories.
3) The 4C Mortality Score (14) was developed to evaluate the in-hospital mortality of COVID-19 patients. It is based on age, sex, the number of comorbidities, respiratory rate, peripheral oxygen saturation on room air, the Glasgow Coma Score, urea, and C-reactive protein (CRP). Patients are stratified into four risk categories.
4) The CURB-65 score (15) was originally developed to evaluate the mortality of community-acquired pneumonia patients. It is based on confusion, urea, respiratory rate, blood pressure, and age. Patients are stratified into five risk categories.
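To make the simplicity of the CURB-65 concrete, the point total can be sketched in a few lines of Python. This is a minimal illustration, not the scoring procedure used in the registry: the `Admission` container is hypothetical, and the cut-offs (urea above 7 mmol/L, respiratory rate of 30/min or more, systolic pressure below 90 mm Hg or diastolic pressure of 60 mm Hg or less, age 65 years or more) are taken from the standard published definition of the score rather than from this text. How point totals were grouped into the five risk categories is not specified here, so the sketch returns only the raw total.

```python
from dataclasses import dataclass


@dataclass
class Admission:
    """Hypothetical container for the five CURB-65 inputs at admission."""
    confusion: bool        # new-onset confusion
    urea_mmol_l: float     # serum urea, mmol/L
    respiratory_rate: int  # breaths per minute
    systolic_bp: int       # mm Hg
    diastolic_bp: int      # mm Hg
    age_years: int


def curb65(a: Admission) -> int:
    """Return the CURB-65 point total (0-5), one point per criterion met."""
    criteria = [
        a.confusion,                                  # C: confusion
        a.urea_mmol_l > 7.0,                          # U: urea (assumed standard cut-off)
        a.respiratory_rate >= 30,                     # R: respiratory rate
        a.systolic_bp < 90 or a.diastolic_bp <= 60,   # B: low blood pressure
        a.age_years >= 65,                            # 65: age
    ]
    return sum(criteria)


# Example: a 70-year-old with raised urea and otherwise normal findings
patient = Admission(confusion=False, urea_mmol_l=8.0, respiratory_rate=20,
                    systolic_bp=120, diastolic_bp=80, age_years=70)
total = curb65(patient)  # two criteria met (urea, age)
```

None of the five inputs requires knowledge of the patient's medical history, in contrast to the comorbidity counts used by the other scores listed above.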

Statistical analysis
The normality of distribution of numerical variables was assessed with a Kolmogorov-Smirnov test. Numerical variables are presented as median and interquartile range (IQR) and were compared between the groups with a Mann-Whitney U test. Categorical variables are presented as frequencies and percentages and were compared between the groups with a χ² test. The receiver operating characteristic (ROC) curve analysis was used to assess the predictive properties of prognostic scores regarding clinical outcomes of interest (30-day mortality, in-hospital mortality, presence of severe or critical disease on admission, need for an intensive care unit, and mechanical ventilation during hospitalization). Kaplan-Meier survival analysis was used, and survival curves were compared between the groups with the Cox-Mantel version of the log-rank test (16,17).

RESULTS
Patients' characteristics and risk score categories stratified according to in-hospital mortality are shown in Table 1. Thirty-day mortality curves for the entire cohort and stratified by the categories of the WHO severity, COVID-GRAM, VACO Index, 4C Mortality, and CURB-65 prognostic scores are shown in Figure 1A-F. All of the investigated prognostic scores significantly distinguished between groups of patients with different prognosis (overall P < 0.001 for all analyses). The WHO COVID-19 severity classification did not differentiate mild from moderate patients, but it significantly differentiated the patients with severe and critical symptoms from each other and from lower-risk groups (Figure 1B, overall P < 0.001). Thirty-day mortality rates were 4.2%, 5.3%, 35.5%, and 63.4% for mild, moderate, severe, and critical groups, respectively.
The COVID-GRAM identified a low-risk group of patients in whom no events occurred and who therefore could not be statistically compared with the other groups. Patients belonging to the high-risk group significantly differed in survival from the medium-risk group (Figure 1C, overall P < 0.001).
The 4C Mortality Score distinguished between four groups of patients with significantly different prognosis (Figure 1E, overall P < 0.001). Thirty-day mortality rates were 4.1%, 8.8%, 34.4%, and 72.3% for low, intermediate, high, and very high risk groups, respectively.
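For readers who want to see how a 30-day mortality rate can be read off a survival curve, the following is a minimal Kaplan-Meier sketch in plain Python. It is an illustration under stated assumptions, not the authors' analysis: the function names are hypothetical, the input data are made up, and the text does not state whether the reported rates were derived from the Kaplan-Meier estimate or as crude proportions.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time at which a death occurred.

    times  : follow-up in days for each patient
    events : 1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving time t
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def mortality_at(curve, day):
    """Estimated cumulative mortality 1 - S(day) from a Kaplan-Meier curve."""
    s = 1.0
    for t, value in curve:
        if t <= day:
            s = value
        else:
            break
    return 1.0 - s


# Made-up follow-up data for six patients (days, death indicator)
times = [5, 10, 10, 20, 30, 30]
events = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
mortality_30d = mortality_at(curve, 30)
```

Computing one such curve per risk category and comparing them with a log-rank test mirrors the structure of the analysis described in the statistical methods.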
Comparison of prognostic properties of COVID-19 prognostic scores regarding different clinical outcomes in the entire cohort
The CURB-65 and 4C Mortality Score demonstrated the best overall performance in correctly classifying death-related outcomes. They had the best AUC values of similar magnitude, which were significantly better than those of the other indices for 30-day mortality (AUC 0.761 and 0.761 for CURB-65 and 4C Mortality Score, respectively) and for in-hospital mortality (AUC 0.757 and 0.762 for CURB-65 and 4C Mortality Score, respectively). The 4C Mortality Score and COVID-GRAM achieved the best performance in recognizing patients with WHO-defined severe or critical disease (AUC 0.785 and 0.717 for 4C Mortality Score and COVID-GRAM, respectively). However, none of the prognostic indices discriminated well between patients requiring and not requiring intensive care unit treatment or mechanical ventilation.
In this context, the WHO severity classification on presentation achieved the highest, although modest, AUC values (AUC 0.667 and 0.687 for intensive care unit and mechanical ventilation, respectively) (Table 2).
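The AUC values above can be interpreted as the probability that a randomly chosen patient who experienced the outcome had a higher score than a randomly chosen patient who did not. A minimal sketch of that calculation follows; it assumes a binary outcome coded 0/1, uses made-up scores, and is not the software used for the reported analyses. The function name `roc_auc` is hypothetical.

```python
def roc_auc(scores, outcomes):
    """AUC as the probability that a patient with the event has a higher
    score than a patient without it; tied scores count as 0.5."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = 0.0
    # Compare every event/non-event pair (quadratic, fine for a sketch)
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up score totals and 30-day deaths for eight patients
scores = [0, 1, 1, 2, 3, 3, 4, 5]
died = [0, 0, 0, 1, 0, 1, 1, 1]
auc = roc_auc(scores, died)  # 14.5 concordant pairs out of 16 -> 0.90625
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination, which places the values of 0.667 and 0.687 for intensive care and mechanical ventilation in the modest range.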

Comparison of predictive properties of COVID-19 prognostic scores in subgroups of patients with various disease severity
We further evaluated the performance of different prognostic scores in subgroups of patients with WHO-defined mild or moderate (Supplementary

Independent prognostic properties of different prognostic scores
We analyzed all the investigated prognostic indices stratified by their respective prognostic categories in the Cox regression model for 30-day mortality (Table 3). WHO severe vs mild disease, WHO critical vs mild disease, COVID-GRAM high vs medium plus low risk, 4C Mortality Score very high vs low risk, and all prognostic CURB-65 categories remained significantly associated with worse survival and performed independently of each other in distinguishing 30-day mortality. Prognostic properties of the VACO Index and lower-risk 4C Mortality Score categories were redundant, as their prognostic categories were not significantly associated with survival when controlling for other scores.
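As a reading aid, the multivariate model above can be written in the standard Cox proportional hazards form. The notation is an assumption introduced here for illustration, since the text does not give the model equation: $x_k$ are indicator variables for each score category versus its reference category (for example, WHO severe vs mild), $h_0(t)$ is the baseline hazard, and $\mathrm{HR}_k$ is the hazard ratio for category $k$.

```latex
h(t \mid x) = h_0(t)\,\exp\!\Big(\sum_{k} \beta_k x_k\Big),
\qquad \mathrm{HR}_k = e^{\beta_k}
```

Under this form, a category is independently prognostic when its $\beta_k$ differs significantly from zero with all other scores in the model, and redundant otherwise.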

DISCUSSION
In the current study, all of the investigated prognostic models were able to identify groups of patients with a worse prognosis. However, the scores performed differently in terms of prediction of clinical outcomes of interest, as well as in terms of the number of prognostic categories they were able to distinguish.
Specific COVID-19 scores did not outperform the classical CURB-65 score, developed for community-acquired pneumonia. Also, the predictive properties of specific scores were mostly lower than in the original patient cohorts or other validation studies (10,18), a finding that further highlights the differences among various clinical contexts and the importance of real-life data. The best predictive properties were observed among patients with mild or moderate disease symptoms. The CURB-65 and 4C Mortality Score performed best at correctly recognizing patients with inferior 30-day and in-hospital survival. Both scores include information on age, respiratory, hydration, and mental status, with the 4C Mortality Score additionally including information on comorbidity burden and inflammation. These two scores comparably distinguished between survival-related outcomes, even though the CURB-65 score requires fewer variables and no information on patient history. The CURB-65 also provides the highest number of prognostic categories (five) compared with all other scores, enabling more precise risk stratification.
The investigated prognostic scores provided mutually additional prognostic information regarding 30-day mortality, with the exception of the VACO Index, whose prognostic properties were redundant when evaluated simultaneously with other prognostic scores. The VACO Index is based on age, sex, and the number of comorbidities but does not provide information on the current inflammatory and respiratory status. Its strength is the prediction of the risk associated with future COVID-19 infection, but it may not perform as well as other scores at hospital admission, either in predicting particular outcomes or in providing additional prognostic information. The VACO Index may be improved by adding parameters reflecting the acute state of the patient (19). Since many COVID-19 prognostic scores include information on comorbidities, their use requires detailed knowledge of the patient's history, which may not be available in pandemic working conditions, where incomplete medical records and patients' inability to provide a correct history due to confusion are common. Simple and quick-to-obtain biochemical parameters such as red cell distribution width (RDW) and CRP-to-albumin ratio (CAR) provide additional prognostic information to COVID-19 prognostic scores (9,20). Since they are non-specific and may be profoundly affected by comorbidities (like RDW) or COVID-19 associated inflammation (CAR), they represent excellent candidates to be added to the current prognostic scores and to allow more precise risk stratification.
Limitations of the study are the single-center experience, retrospective design, and study period before or at the very beginning of the vaccination program. Our data are representative of a tertiary referral center with a high number of mostly elderly, severe or critical COVID-19 patients with a number of acute or chronic medical conditions. Thus, our results provide a unique overview specific to this clinical context. Our results need confirmation from studies on independent data sets.
In conclusion, complex prognostic scores based on a large number of parameters and comorbid conditions did not achieve better prognostic properties for survival outcomes of hospitalized COVID-19 patients in comparison with a simple CURB-65 prognostic score. The CURB-65 also provides the largest number of prognostic categories (five), allowing more precise risk stratification than other prognostic scores.
Acknowledgments This paper is a part of the project "Registar hospitalno liječenih bolesnika u Respiracijskom centru KB Dubrava"/"Registry of hospitalized patients in Clinical Hospital Dubrava Respiratory center."
Funding None.
Declaration of authorship ML conceived and designed the study; ML, NPŽ, TR, ID, JS, IJ, DF, IK, AJ, NB acquired the data; all authors analyzed and interpreted the data; ML drafted the manuscript; all authors critically revised the manuscript for important intellectual content; all authors gave approval of the version to be submitted; all authors agree to be accountable for all aspects of the work.
Competing interests ML is a statistical editor in the Croatian Medical Journal. To ensure that any possible conflict of interest relevant to the journal has been addressed, this article was reviewed according to best practice guidelines of international editorial organizations. All authors have completed the Unified Competing Interest form at www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author) and declare: no support from any organization for the submitted work; no financial relationships with any organizations that might have an interest in the submitted work in the previous 3 years; no other relationships or activities that could appear to have influenced the submitted work.